Aggregation of Multiple Judgments for Evaluating Ordered Lists
Authors
Abstract
Many tasks (e.g., search and summarization) produce an ordered list of items. To evaluate such a list, we need to compare it with an ideal ordered list created by a human expert for the same set of items. To reduce bias, multiple human experts are often used to create multiple ideal ordered lists. An interesting challenge in such an evaluation method is thus how to aggregate these different ideal lists to compute a single score for the ordered list being evaluated. In this paper, we propose three new methods for aggregating multiple order judgments to evaluate ordered lists: weighted correlation aggregation, rank-based aggregation, and frequent sequential pattern-based aggregation. Experimental results on ordering sentences for text summarization show that all three new methods outperform the state-of-the-art average correlation methods in terms of discriminativeness and robustness against noise. Among the three proposed methods, the frequent sequential pattern-based method performs best, owing to its flexible modeling of agreements and disagreements among human experts at various levels of granularity.
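The abstract does not define the average correlation baseline in detail. The following is a minimal sketch under one plausible reading: score the candidate list by its Kendall's tau against each expert's ideal list, then take the unweighted mean. The function names `kendall_tau` and `average_correlation` are hypothetical, and the choice of Kendall's tau as the correlation measure is an assumption, not something the abstract states. The three proposed methods are not reproduced here.

```python
from itertools import combinations


def kendall_tau(order_a, order_b):
    """Kendall's tau between two orderings of the same distinct items.

    Assumption: both lists are full permutations of one item set (no ties).
    """
    pos_b = {item: i for i, item in enumerate(order_b)}
    if set(order_a) != set(pos_b) or len(order_a) != len(order_b):
        raise ValueError("both orderings must contain the same distinct items")

    concordant = discordant = 0
    # combinations() yields pairs (x, y) with x before y in order_a.
    for x, y in combinations(order_a, 2):
        if pos_b[x] < pos_b[y]:
            concordant += 1  # order_b agrees on this pair
        else:
            discordant += 1  # order_b reverses this pair

    total = concordant + discordant
    # With fewer than two items there are no pairs; treat as perfect agreement.
    return (concordant - discordant) / total if total else 1.0


def average_correlation(candidate, expert_orders):
    """Unweighted mean of Kendall's tau over all expert orderings."""
    if not expert_orders:
        raise ValueError("at least one expert ordering is required")
    taus = [kendall_tau(candidate, expert) for expert in expert_orders]
    return sum(taus) / len(taus)


if __name__ == "__main__":
    candidate = ["s1", "s2", "s3"]
    experts = [["s1", "s2", "s3"], ["s1", "s3", "s2"]]
    print(average_correlation(candidate, experts))
```

Because every expert list contributes equally, one noisy judge shifts the score as much as a reliable one, which is the kind of weakness the weighted, rank-based, and pattern-based aggregations in the abstract are meant to address.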
Related resources
2-tuple intuitionistic fuzzy linguistic aggregation operators in multiple attribute decision making
In this paper, we investigate multiple attribute decision making (MADM) problems with 2-tuple intuitionistic fuzzy linguistic information. We then use arithmetic and geometric operations to develop some 2-tuple intuitionistic fuzzy linguistic aggregation operators. The prominent characteristics of these proposed operators are studied. Then, we have utilized these operators to develop some app...
Evaluation Method for Feature Rankings and their Aggregations for Biomarker Discovery
In this paper we investigate the problem of evaluating ranked lists of biomarkers, which are typically an output of the analysis of high-throughput data. This can be a list of probes from microarray experiments, which are ordered by the strength of their correlation to a disease. Usually, the ordering of the biomarkers in the ranked lists varies a lot if they are a result of different studies o...
Hesitant Fuzzy Linguistic Arithmetic Aggregation Operators in Multiple Attribute Decision Making
In this paper, we investigate the multiple attribute decision making (MADM) problem based on the arithmetic and geometric aggregation operators with hesitant fuzzy linguistic information. Then, motivated by the idea of traditional arithmetic operation, we have developed some aggregation operators for aggregating hesitant fuzzy linguistic information: hesitant fuzzy linguistic weighted average (...
Extended and infinite ordered weighted averaging and sum operators with numerical examples
This study discusses some variants of Ordered Weighted Averaging (OWA) operators and related information aggregation methods. In detail, we define the Extended Ordered Weighted Sum (EOWS) operator and the Extended Ordered Weighted Averaging (EOWA) operator, which are applied in scientometrics evaluation where the preference is over finitely many representative works. As...
Crowdsourcing Preference Judgments for Evaluation of Music Similarity Tasks
Music similarity tasks, where musical pieces similar to a query should be retrieved, are quite troublesome to evaluate. Ground truths based on partially ordered lists were developed to cope with problems regarding relevance judgment, but they require such man-power to generate that the official MIREX evaluations had to turn over more affordable alternatives. However, in house eval...